Abstract

Motivated by an emerging theory of robust low-rank matrix representation, in this 
paper, we introduce a novel solution for online rigid-body motion registration. The 
goal is to develop algorithmic techniques that enable a robust, real-time motion reg- 
istration solution suitable for low-cost, portable 3-D camera devices. Assuming 3-D 
image features are tracked via a standard tracker, the algorithm first utilizes Robust 
PCA to initialize a low-rank shape representation of the rigid body. Robust PCA 
finds the global optimal solution of the initialization, while its complexity is compa- 
rable to singular value decomposition. In the online update stage, we propose a more 
efficient algorithm for sparse subspace projection to sequentially project new feature 
observations onto the shape subspace. The lightweight update stage guarantees the 
real-time performance of the solution while maintaining good registration even when 
the image sequence is contaminated by noise, gross data corruption, outlying features, 
and missing data. The state-of-the-art accuracy of the solution is validated through 
extensive simulation and a real-world experiment, while the system enjoys one to two
orders of magnitude speed-up compared to well-established RANSAC solutions. The 
new algorithm will be released online to aid peer evaluation. 



1 Introduction 

Rigid body motion registration (RBMR) is one of the fundamental problems 
in machine vision and robotics. Given a dynamic scene that contains a (domi- 
nant) rigid body object and a cluttered background, certain salient image feature 
points can be extracted and tracked with considerable accuracy across multiple 
image frames [14]. The task of RBMR then involves identifying the image features
that are associated only with the rigid-body object in the foreground and
subsequently recovering its rigid-body transformation across multiple frames. 

*C. Slaughter, J. Bagwell, C. Checkles and S. Vishwanath are with the
Electrical and Computer Engineering Department, University of Texas, Austin,
USA. <chris.c.slaughter@gmail.com, justindbagwell@mail.utexas.edu,
ccheckles@utexas.edu, sriram@austin.utexas.edu>. A. Yang is with the EECS
Department, University of California, Berkeley, USA. <yang@eecs.berkeley.edu>. L.
Sentis is with the Mechanical Engineering Department, University of Texas, Austin, USA.
<lsentis@austin.utexas.edu>. This work was supported in part by the ONR, an Intel
Graduate Fellowship, NSF CNS-0905200, ARO MURI W911NF-06-1-0076, and a Willow
Garage gift.



Traditionally, RBMR has been mainly conducted in 2-D image space, under camera
projection models that range from simple orthographic projection [16] to more
realistic models such as paraperspective [11] and affine [8]. In problems such
as RBMR, Structure from Motion (SfM), and motion segmentation [9, 19], a
fundamental observation is that a data matrix that contains the coordinates of
tracked image features in column form can be factorized as a camera matrix that
represents the motion and a shape matrix that represents the shape of the rigid
body in the world coordinates. Furthermore, if the data are noise-free, then the
feature vectors in the data matrix lie in a 4-D subspace, as the rank of the
shape matrix in the world coordinates is at most four [16].

In practice, the RBMR problem can become more challenging if the tracked 
image features are perturbed by moderate noise, gross image corruption (e.g., 
when the features are occluded), and missing data (e.g., when the features leave 
the field of view). In robust statistics, it is well known that the optimal solution 
to recover a subspace model when the data is complete yet affected by Gaus- 
sian noise is singular value decomposition (SVD). Solving other image nuisances 
caused by gross measurement error corresponds to the problem of robust esti- 
mation of a low-dimensional subspace model in the presence of corruption and 
missing data. In [6], for instance, the issue of missing data was addressed by
robustifying SVD via Power Factorization. In [3], the same issue was addressed
by an iterative imputation strategy.

In the case of outlier rejection, arguably the most popular robust model esti- 
mation algorithm in computer vision is Random Sample Consensus (RANSAC) 
[5]. In the context of RBMR, the standard procedure of RANSAC is to apply
the iterative hypothesize-and-verify scheme on a frame-by-frame basis to
recover rigid-body motion [Ul [HI [20]. In the context of dimensionality
reduction, RANSAC can also be applied to recover low-dimensional subspace models
[22], such as the above shape model in motion registration. 

Nevertheless, the aforementioned solutions have two major drawbacks. In 
the case of missing data, methods such as Power Factorization or incremental 
SVD cannot guarantee the global convergence of the estimate [3]. In the 
case of outlier rejection, the RANSAC procedure is known to be expensive to
deploy in a real-time, online fashion, such as in the solutions for simultaneous
localization and mapping (SLAM) [21], [12]. Therefore, a better solution than
the state of the art should provide provable global optimality when compensating
for missing data, image corruption, and erroneous feature tracks, and at the
same time should recover rigid-body motion from a video sequence more
efficiently in an online fashion. In this paper, we propose a highly robust
solution to address this problem.

1.1 Contributions 

Our solution is motivated by the emerging theory of Robust PCA (RPCA)
[2, 23]. In particular, RPCA provides a unified solution to estimating low-rank
matrices in the cases of both missing data and random data corruption [2]. The
algorithm is guaranteed to converge to the global optimum if the ambient space
dimension is sufficiently high. Compared to other existing solutions such as
incremental SVD and RANSAC, the set of heuristic parameters one needs to
tune is also minimal. Furthermore, recent progress in convex optimization has
led to very efficient numerical implementations of RPCA with computational
complexity comparable to that of classical SVD [10].

Our proposed solution to online 3-D motion registration consists of two steps. 
In the initialization step, RPCA is used to estimate a low-rank representation 
of the rigid-body motion within the first several image frames, which establishes 
a global shape model of the rigid body. In the online update step, we propose 
a sparse subspace projection method that projects new observations onto the 
low-dimensional shape model, simultaneously correcting possible sparse data 
corruption. The overall algorithm is called Sparse Online Low-rank projection 
and Outlier rejection (SOLO). 

Compared to the popular method of RANSAC, one major benefit of the 
new solution is that by enforcing a low-rank shape model, those sparsely cor- 
rupted image features can be compensated instead of simply being discarded. 
In this paper, we apply the algorithm to 3-D motion features that are tracked 
by the relatively new Microsoft Kinect motion sensor. However, the same algo- 
rithm can help address the more traditional RBMR problems with 2-D image 
features. Through extensive simulation and a real-world experiment, we
demonstrate that SOLO solves the online RBMR problem with state-of-the-art
accuracy and, more importantly, runs one to two orders of magnitude
faster than RANSAC. To aid peer evaluation, the MATLAB/C source code of
our algorithm will be released on our website.

2 3-D Feature Tracking 

In this section, we briefly describe the 3-D feature tracking methodology used in 
this paper. In our 3-D tracking subsystem (e.g., on Microsoft Kinect), we first 
identify salient image features, and then track them frame by frame in image 
space (an example is shown in Figure 1). The features are then reprojected
onto the camera coordinate system using depth measurements. Over time, new
features are extracted at periodic intervals to maintain a dense set over the
image geometry. Each feature is tracked independently, and may be dropped 
once it leaves the field of view or produces spurious results (jumps) in camera 
space. 

For tracking, we use the Kanade-Lucas-Tomasi feature tracker (KLT) [14].
It is well known that the KLT tracker is extremely fast and can run in real time 
on a standard desktop computer. For KLT to work effectively, the extracted 
features must exhibit local saliency. To achieve this and produce a dense set of 
features over scenes, we use the Harris corner detector as well as a Difference of 
Gaussians (DoG) extractor [18]. Only the lowest two levels of the DoG pyramid
are used. This ensures that the features exhibit high local saliency in a small 
window and are spatially well-localized. 

One implicit advantage of tracking features across multiple frames is that 
it permits the tracking data to be represented naturally as a matrix. Each 
(sample-indexed) row represents observations of multiple features in a single 
time step, while each column represents the observations of each feature over 
all frames. Overall, the tracking system we employ demonstrates that simple, 
efficient algorithms can track well-localized feature trajectories over multiple
frames. Together with the registration algorithm described in Section 3, our
complete system could be deployed in low-cost embedded devices.

As a point of comparison, many existing SLAM front-ends employ feature
extraction and matching on a frame-by-frame basis [7]. This technique works
quite well because RANSAC rejects misaligned features. However, such front-ends
are subject to two major drawbacks. First, extract-and-match techniques require
hardware acceleration to run in real time. Second, they match features between
frames in feature space, neglecting the continuity of spatial observations of
these features.

Fig. 1: Tracking results of an indoor scene shown on the first frame of the sequence.



3 Online 3-D Rigid Body Motion Registration 
3.1 Problem Statement 

First, we shall formulate the 3-D RBMR problem and introduce the notation
we will use for the rest of the paper. We denote $x_{i,j} \in \mathbb{R}^3$ as the
coordinates of feature $j$ in the $i$th frame, where $i \in [1, \cdots, F]$ and
$j \in [1, \cdots, m]$. In the noise-free case, when the same $j$th feature is
observed in two different frames 1 and $i$, its images satisfy a rigid-body
constraint:

$$x_{i,j} = R_i x_{1,j} + T_i, \tag{1}$$

where $R_i \in \mathbb{R}^{3 \times 3}$ is a rotation matrix and $T_i \in \mathbb{R}^3$ is a
3-D translation. This relation can also be written in homogeneous coordinates as

$$x_{i,j} = \Pi \begin{bmatrix} R_i & T_i \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_{1,j} \\ 1 \end{bmatrix} \doteq \Pi g_i \begin{bmatrix} x_{1,j} \\ 1 \end{bmatrix}, \tag{2}$$

where $\Pi = [I_3, 0] \in \mathbb{R}^{3 \times 4}$ is a projection matrix.

In the noise-free case, since all the features in the $i$th frame satisfy the same
rigid-body motion, one can stack the image coordinates of the same feature
in the $F$ frames in a long vector form, and then the collection of all the $m$
features forms a data matrix $X$, which can be written as the product of two
rank-4 matrices:

$$X \doteq \begin{bmatrix} x_{1,1} & \cdots & x_{1,m} \\ \vdots & \ddots & \vdots \\ x_{F,1} & \cdots & x_{F,m} \end{bmatrix} = \begin{bmatrix} \Pi g_1 \\ \vdots \\ \Pi g_F \end{bmatrix} \begin{bmatrix} x_{1,1} & \cdots & x_{1,m} \\ 1 & \cdots & 1 \end{bmatrix} \in \mathbb{R}^{3F \times m}. \tag{3}$$

Here each $g_i \in \mathbb{R}^{4 \times 4}$ is the homogeneous rigid-body transformation of
frame $i$ in (2). In particular, $g_1 = I_4$ is the identity matrix. It was observed
in [161 E] that when $F, m \gg 4$, the rank of matrix $X$ that represents a
rigid-body motion in space is at most four, which is upper bounded by the rank
of its two factor matrices in (3). In SfM, the first matrix on the right hand
side of (3) is called a motion matrix $M$, while the second matrix is called a
shape matrix $S$. Although (3) is not a unique rank-4 factorization of $X$, a
canonical representation can be determined by imposing additional constraints
on the shape of the object.
Lastly, for motion registration, if we denote the 3-D coordinates (e.g., under
the world coordinates centered at the camera) of the first frame as $W_1 =
[x_{1,1}, \cdots, x_{1,m}] \in \mathbb{R}^{3 \times m}$, then the rigid-body motion
$(R_i, T_i)$ of the features from the world coordinates to any $i$th frame
satisfies the following constraint:

$$W_i \doteq [x_{i,1}, \cdots, x_{i,m}] = R_i W_1 + T_i \mathbf{1}^T, \tag{4}$$

where $\mathbf{1} \in \mathbb{R}^m$ is the all-ones vector. Using (4), the two
transformations $R_i$ and $T_i$ can be recovered by the Orthogonal Procrustes
(OP) method [13]. More specifically, let $\mu_i \in \mathbb{R}^3$ be the mean vector
of the columns of $W_i$, and denote $\bar{W}_i$ as the centered feature coordinates
after the mean is subtracted. Suppose the SVD of $\bar{W}_i \bar{W}_1^T$ gives rise to:

$$(U, \Sigma, V) = \mathrm{svd}(\bar{W}_i \bar{W}_1^T). \tag{5}$$

Then the rotation matrix $R_i = U V^T$ and the translation $T_i = \mu_i - R_i \mu_1$.
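The OP recovery in (4) and (5) can be sketched in a few lines. This is an illustrative Python/NumPy version, not the authors' implementation; the function name `procrustes_motion` is hypothetical, and the determinant correction is an added safeguard against reflections that the text above does not state.

```python
import numpy as np

def procrustes_motion(W1, Wi):
    """Estimate (R, T) with Wi ~= R @ W1 + T 1^T; W1 and Wi are 3 x m arrays."""
    mu1 = W1.mean(axis=1, keepdims=True)       # mean feature position, frame 1
    mui = Wi.mean(axis=1, keepdims=True)       # mean feature position, frame i
    # SVD of the 3x3 matrix built from centered coordinates, as in (5).
    U, _, Vt = np.linalg.svd((Wi - mui) @ (W1 - mu1).T)
    # Assumption (not in the text): force det(R) = +1 to exclude reflections.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ D @ Vt
    T = (mui - R @ mu1).ravel()
    return R, T
```

For noise-free correspondences with at least three non-collinear features, the function returns the generating rotation and translation up to floating-point error.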

In this work, we consider an online solution to RBMR. Our goal is to maintain
the estimation of a low-rank representation of $X$ and its subsequent new
observations $W_i$ with minimal computational complexity. In the rest of the
section, we first discuss the initialization step that jump-starts the low-rank
estimation of the initial observations $X$ in Section 3.2. Then we propose our
solution to update the low-rank estimation in the presence of new observations
$W_i$ in the $i$th frame in Section 3.3. Finally, applying our algorithm to
real-world data may encounter additional nuisances such as new feature tracks
entering the scene and missing data. After the summary of Algorithm 1, we will
briefly show that the proposed solution can be easily extended to handle these
additional conditions in an elegant way.



3.2 Initialization via Robust PCA 

In the initialization step, a robust low-rank representation of $X$ needs to be
obtained in the presence of moderate Gaussian noise, data corruption, and
outlying image features. The problem can be posed as a convex program and
solved by Robust PCA [2, 23]. Here we model $X \in \mathbb{R}^{n \times m}$ as the sum
of three components:

$$X = L_0 + D_0 + E_0, \tag{6}$$

where $L_0$ is a rank-4 matrix that models the ground-truth distribution of the
inlying rigid-body motion, $D_0$ is a Gaussian noise matrix that models the dense
noise independently distributed on the entries of $X$, and $E_0$ is a sparse error
matrix that collects those nonzero coefficients at a sparse support set of
corrupted data, outlying image features, and bad tracks.

The matrix decomposition in (6) can be successfully solved by a principal
component pursuit (PCP) program:

$$\min_{L,E} \|L\|_* + \lambda \|E\|_1 \quad \text{subj. to} \quad \|X - L - E\|_F \le \delta, \tag{7}$$

where $\|\cdot\|_*$ denotes the matrix nuclear norm, $\|\cdot\|_1$ denotes the entry-wise
$\ell_1$-norm for both matrices and vectors, and $\lambda$ is a regularization
parameter that can be fixed as $1/\sqrt{\max(n, m)}$. It has been shown in [1, 23]
that when the dimension of matrix $X$ is sufficiently high and with some extra
mild conditions on the coefficients of $L_0$ and $E_0$, with overwhelming
probability, the global (approximate) solution of $L_0$ and $E_0$ can be recovered.
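To make the PCP program (7) concrete, the following Python/NumPy sketch runs an inexact augmented Lagrangian iteration for the equality-constrained case ($\delta = 0$). It is a hedged illustration in the spirit of the ALM solver cited as [10], not the released code: the function names, the initial penalty, the growth factor `rho`, and the stopping tolerance are all assumptions chosen for this example.

```python
import numpy as np

def soft_threshold(Z, t):
    """Entry-wise shrinkage: sign(z) * max(|z| - t, 0)."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def pcp_ialm(X, lam=None, rho=1.5, tol=1e-7, max_iter=500):
    """Split X into low-rank L and sparse E with X = L + E (delta = 0 in (7))."""
    n, m = X.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n, m))          # fixed regularization weight
    norm_X = np.linalg.norm(X, "fro")
    spec = np.linalg.norm(X, 2)                 # largest singular value of X
    Y = X / max(spec, np.abs(X).max() / lam)    # multiplier initialization (assumed)
    mu, mu_max = 1.25 / spec, 1.25e7 / spec     # penalty schedule (assumed)
    L = np.zeros_like(X)
    E = np.zeros_like(X)
    for _ in range(max_iter):
        # Sparse step: soft-threshold the residual entries.
        E = soft_threshold(X - L + Y / mu, lam / mu)
        # Low-rank step: soft-threshold the singular values.
        U, s, Vt = np.linalg.svd(X - E + Y / mu, full_matrices=False)
        L = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        R = X - L - E
        Y = Y + mu * R                          # multiplier update
        mu = min(rho * mu, mu_max)              # monotonically increasing penalty
        if np.linalg.norm(R, "fro") <= tol * norm_X:
            break
    return L, E
```

Each iteration costs one SVD of a matrix the size of `X`, which is consistent with the remark that the overall cost is a small multiple of a single SVD.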

The key characteristics of the PCP algorithm are highlighted as follows: 
Firstly, the regularization parameter A does not necessarily rely on the level of 
corruption in Eq, so long as their occurrences are bounded. Secondly, although 
the theory assumes the sparse error should be randomly distributed in X, the 
algorithm itself is surprisingly robust to both sparse random corruption and 
highly correlated outlying features as a small number of column vectors in X. 
Finally, although the original implementation of PCP in [2] is computationally
intractable for real-time applications, its most recent implementation based on
an augmented Lagrangian method (ALM) has significantly reduced its
complexity [10]. In this paper, we adopt the ALM solver for Robust PCA, whose average
run time is merely a small constant (in general smaller than 20) times the run
time of SVD. In our online formulation of SOLO, this calculation only needs to
be performed once in the initialization step.

Since the resulting low-rank matrix $L$ may still contain entries of outlying
features, an extra step needs to be taken to remove those outliers. In particular,
one can calculate the $\ell_0$-norm of each column in $E = [e_1, e_2, \cdots, e_m]$.
With respect to an outlier threshold $\tau$, if $\|e_i\|_0 > \tau$, then $e_i$
represents dense corruption on the corresponding feature track and hence should
be regarded as an outlier. Subsequently, the indices of the inliers define a
support set $I \subset [1, \cdots, m]$. Hence, we denote the cleaned low-rank data
matrix after outlier rejection as

$$\hat{L} \doteq L^{(I)}. \tag{8}$$
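The column test that defines the support set $I$ can be sketched as follows in Python/NumPy. The function name `inlier_support` and the tolerance `eps` are hypothetical; `eps` plays the role of the hard threshold that zeroes out small coefficients of $E$ before the nonzeros are counted.

```python
import numpy as np

def inlier_support(E, tau, eps=1e-6):
    """Return indices of columns of E with at most tau nonzero entries."""
    # Entries with magnitude <= eps are treated as zero (hard thresholding).
    nonzeros_per_column = (np.abs(E) > eps).sum(axis=0)
    return np.flatnonzero(nonzeros_per_column <= tau)
```

The cleaned matrix in (8) is then `L[:, inlier_support(E, tau)]`, where the value of `tau` has to be chosen by the user.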

Finally, we note that although in (7), $L$ represents the optimal matrix
solution with the lowest possible rank, due to additive noise and data corruption
in the measurements, its rank may not necessarily be less than five.
Therefore, to enforce the rank constraint in the RBMR problem and further obtain
a representative of the shape matrices that span the 4-D subspace, an SVD is
performed on $\hat{L}$ to identify its right singular subspace:

$$(U, \Sigma, V) = \mathrm{svds}(\hat{L}, 4), \tag{9}$$

where $V \in \mathbb{R}^{m \times 4}$ is then a representative of the rigid body's shape matrices.
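The truncated SVD in (9) is written with MATLAB's `svds`. As a rough Python/NumPy equivalent, one can compute a thin SVD and keep the leading four right singular vectors; the function name `shape_basis` below is a hypothetical helper for illustration only.

```python
import numpy as np

def shape_basis(L_hat, rank=4):
    """Return an m x rank orthonormal basis of the dominant row space of L_hat."""
    # np.linalg.svd returns singular values in descending order, so the first
    # `rank` rows of Vt span the dominant right singular subspace.
    _, _, Vt = np.linalg.svd(L_hat, full_matrices=False)
    return Vt[:rank].T
```

Because the columns of the returned matrix are orthonormal, least-squares projection of a row vector `w` onto the shape subspace reduces to `w @ V @ V.T`.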



Footnote: For those coefficients in $E$ with small nonzero values, hard thresholding
can be applied to reduce the values to zero.



Fig. 2: A visualization of sparse subspace projection as basis-pursuit denoising, which
can be solved by $\ell_1$-minimization.



3.3 Sparse Online Low-rank projection and Outlier rejection 
(SOLO) 

In this section, we propose a novel algorithm that projects new observations
$W_i$ from the $i$th frame onto the rigid-body shape subspace. This subspace is
parameterized by the shape matrix $V$ that we have estimated in the
initialization step. Traditionally, a (least squares) subspace projection
operator would project a (noisy) sample orthogonally onto the subspace that it
is close to, which only involves basic matrix-vector multiplication. However, in
anticipation of continual random feature corruption during the course of
feature tracking for RBMR, the projection must also be robust to sparse error
corruption in $W_i$. Hence, we contend that SOLO is a more appropriate yet still
efficient algorithm to achieve online motion registration update.

Given the initialization $\hat{L}$ and the inlier support set $I$, without loss of
generality, we assume $W_i$ only contains those features in the support set $I$.
As discussed in (3) and (9), matrix $V$ from the SVD of $\hat{L}$ is a
representative of the class of all the shape matrices of the rigid body up to an
ambiguity of a 4-D rotation on the subspace. Therefore, the new observations
$W_i$ of the same features should also lie on the same shape subspace. That is,
let $W_i = [w_1^T; w_2^T; w_3^T]$, where each $w_j^T \in \mathbb{R}^{1 \times m}$ is a row
vector. Then

$$w_j^T = a^T V^T \quad \text{for some } a \in \mathbb{R}^4. \tag{10}$$

In the presence of sparse corruption, the row vector $w_j^T$ is perturbed by a
sparse vector $e$:

$$w_j^T = a^T V^T + e^T \in \mathbb{R}^{1 \times m}. \tag{11}$$



The sparse projection constraint (11) bears resemblance to basis-pursuit
denoising (BPDN) in compressive sensing literature [4], as a sparse error perturbs a
high-dimensional sample away from a low-dimensional subspace model. The
standard procedure of BPDN using $\ell_1$-minimization ($\ell_1$-min) is illustrated in
Figure 2.

However, we notice that a BPDN-type solution via $\ell_1$-min may not be the
optimal solution to our problem. The reason is that the row vectors in $W_i =
[w_1^T; w_2^T; w_3^T]$ are not three arbitrary vectors in the 4-D subspace $V$. In
fact, the three vectors must be projected onto a nonlinear manifold
$\mathcal{M}$ embedded in the shape subspace $V$, and the span of the shape model
can be interpreted as the linear hull of the feasible rigid-body motions between
$W_1$ and $W_i$. Figure 3 illustrates this rigid-body constraint applied to sparse
subspace projection in 3-D.

Footnote: In this paper, we may choose to abuse the notation of $V$ to also represent
the 4-D subspace.

Fig. 3: The row vectors of $W$ should be projected onto a manifold in $V$ that
represents a valid rigid-body motion.

Our algorithm of sparse shape subspace projection is described as follows.
Given the observation $W_i$ and a shape subspace $V$, the algorithm minimizes:

$$\min_{E, A} \|E\|_1 \quad \text{subj. to} \quad W_i = A V^T + E. \tag{12}$$

By virtue of the low dimensionality of this hull, together with the sparsity of
the residual, the projected data $A V^T$ should be well localized on the manifold.
Hence, in addition to being consistent with a realistic (sparse) noise model, the
new sparse subspace projection algorithm (12) also implies the benefit of good
localization in the motion space.

The objective can be solved quite efficiently (and much faster than solving
RPCA in the initialization) by the same augmented Lagrangian approach in [10]:

$$\min_{E, A} \|E\|_1 + \langle Y, W_i - A V^T - E \rangle + \frac{\mu}{2} \|W_i - A V^T - E\|_F^2, \tag{13}$$

where $Y$ is a matrix of Lagrange multipliers, and $\mu > 0$ represents a
monotonically increasing penalty parameter during the optimization. The
optimization only involves a soft-thresholding function applied to the entries
of $E$ and matrix-matrix multiplication for the update of $A$ and $E$, and does
not involve computation of singular values as in RPCA.
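The alternating updates behind (13) can be sketched as follows in Python/NumPy. This is a hedged illustration rather than the authors' solver: it assumes the columns of $V$ are orthonormal (so the minimizer over $A$ is a single matrix product), and the function names, the initial penalty `mu`, the growth factor `rho`, the cap `mu_max`, and the fixed iteration count are all assumptions made for this example.

```python
import numpy as np

def soft_threshold(Z, t):
    """Entry-wise shrinkage: sign(z) * max(|z| - t, 0)."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def sparse_subspace_projection(W, V, mu=1.0, rho=1.1, mu_max=1e6, n_iter=300):
    """Approximately solve min ||E||_1 subject to W = A V^T + E, as in (12).

    W is 3 x m (one frame of features); V is m x 4 with orthonormal columns.
    """
    E = np.zeros_like(W)
    Y = np.zeros_like(W)                        # Lagrange multipliers
    for _ in range(n_iter):
        # A-step: least squares in A; closed form because V^T V = I.
        A = (W - E + Y / mu) @ V
        # E-step: entry-wise soft-thresholding of the residual.
        E = soft_threshold(W - A @ V.T + Y / mu, 1.0 / mu)
        # Multiplier update, then increase the penalty monotonically.
        Y = Y + mu * (W - A @ V.T - E)
        mu = min(rho * mu, mu_max)
    return A, E
```

Each pass uses only products with the m x 4 matrix `V` and an entry-wise threshold, so no singular value decomposition is needed in the online stage. Columns of the returned `E` that are numerically zero indicate features that can be treated as uncorrupted in that frame.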

Finally, the rigid-body motion between each $W_i$ and the first reference frame
$W_1$ after the projection can be recovered by the OP algorithm (5). However,
as the projection (12) may also be affected by dense Gaussian noise, the
estimated low-rank component may not accurately represent a consistent
rigid-body motion. We therefore identify an index set $I_i$ of those
uncorrupted features with zero coefficients in $E$. The OP algorithm is then
applied using only the uncorrupted original features in $W_1$ and $W_i$. In a
sense, this motion registration algorithm resembles the strategy in RANSAC to
select inlying sample sets. However, our algorithm has the ability to directly
identify the corrupted features via sparse subspace projection, and hence the
process is noniterative and more efficient.

The complete algorithm, Sparse Online Low-rank projection and Outlier
rejection (SOLO), is summarized in Algorithm 1.



Algorithm 1: SOLO
Input: Initial observations $X$, feature coordinates of the reference frame
$W_1$, and $W_i$ for each subsequent frame $i$.

Init: Compute $L$ and $I$ of $X$ via RPCA (7).
Remove outliers in the reference frame.
$[U, \Sigma, V] = \mathrm{svds}(L^{(I)}, 4)$.
for each new observation frame $i$ do
    Identify corruption $E$ via sparse subspace projection (12).
    Let $I_i$ be the index set of uncorrupted features in $W_i$.
    Estimate $(R_i, T_i)$ using inlying samples in $I \cap I_i$.
end for

Output: Inlier support set $I$, rigid-body motions $(R_i, T_i)$.



Before we proceed to discuss results from our experiment, it is worth
mentioning a straightforward yet elegant extension of the algorithm in the
presence of missing data. In the initialization step, one can rely on a variant
of RPCA to recover the missing data in matrix $X$. The technique is known as
low-rank matrix completion [1, 2], which minimizes a similar low-rank
representation objective constrained on the observable coefficients:

$$\min_{L,E} \|L\|_* + \lambda \|E\|_1 \quad \text{subj. to} \quad \mathcal{P}_\Omega(L + E) = \mathcal{P}_\Omega(X), \tag{14}$$

where $\Omega$ is an index set of those features that remain visible in $X$, and
$\mathcal{P}_\Omega$ is the orthogonal projection onto the linear space of matrices
supported on $\Omega$.
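The operator $\mathcal{P}_\Omega$ in (14) simply keeps the observed entries and zeroes the rest. A minimal Python/NumPy sketch, assuming missing measurements are marked as NaN in the data matrix (a convention chosen for this example, with the hypothetical helper name `project_onto_observed`), is:

```python
import numpy as np

def project_onto_observed(M, observed):
    """P_Omega: keep entries where `observed` is True and set the rest to zero."""
    return np.where(observed, M, 0.0)

# Example data matrix with one missing measurement marked as NaN (assumed encoding).
X = np.array([[1.0, np.nan],
              [3.0, 4.0]])
observed = ~np.isnan(X)                 # boolean mask encoding the index set Omega
PX = project_onto_observed(X, observed)
```

A candidate pair `(L, E)` satisfies the constraint in (14) when `project_onto_observed(L + E, observed)` equals `PX`, so the unobserved entries of `L + E` are left free.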



Using low-rank matrix completion (14), in the presence of a partial
measurement of new feature tracks, those incomplete new observations should be
identified as tracks with missing data. Then a new initialization step using (14)
should be performed on a new data matrix $X$ that includes the new tracks to
re-establish the shape subspace $V$ and inlier support set $I$ as in (9).



4 Experiment 

In this section, we validate the performance of the SOLO algorithm and compare
it with the classical RANSAC solution, which has been the most popular solution
to date for SLAM and motion registration. In the rest of the section, the two
algorithms will be applied to a thorough list of simulations and a real-world
experiment. The benchmarks are calculated in MATLAB on a 2 GHz PC with an
Intel Core i7 processor.



4.1 Simulated Analysis 



We first use synthesized data to benchmark the accuracy and speed of our
batch motion registration algorithm described in Section 3.2. The calculation
of $(R_i, T_i)$ between each pair of $W_1$ and $W_i$ will be based on $\hat{L}$ alone
as the output of RPCA and outlier rejection (8). We compare the performance of
motion registration by RPCA with that by the classical solution of RANSAC
on a frame-by-frame basis. The minimal feature set in RANSAC is set to four.

In one simulation, the outlier rejection results in motion registration by
RPCA and RANSAC are visualized in Figure 4. In this example, we observe
that RPCA is much more effective than RANSAC in identifying both random data
corruption and outlying feature tracks (which post inconsistent feature
measurements in entire columns). Also note that the large coefficient
difference in the two columns of Figure 4e should not be a concern, as it is
well known that RPCA cannot uniquely recover dense column corruption [2], and
the corresponding features will nevertheless be rejected as outliers by (8).
Finally, we can also see a quite significant difference between the
ground-truth low-rank matrix $L_0$ and its estimate $L^*$. This shows that the
accuracy of RPCA is still sensitive to high-variance dense Gaussian noise.




Fig. 4: A visualization of the estimation error of a simulated motion matrix $X$ by
RANSAC and RPCA. Panels: (a) added data corruption $D_0 + E_0$; (b) features
rejected by RANSAC; (c) sparse error estimated by RPCA; (d) sparse error
difference $|E_0 - E^*|$; (e) ground-truth difference $|L_0 - L^*|$. The added data
corruption and estimation difference are represented as white pixels in the images.

To overcome the issue of dense Gaussian noise in RPCA, our recommended
implementation further adds a RANSAC-style refinement stage, which selects a
minimal set of inlying samples from the support set already identified by RPCA.
Correspondences consistent with the constructed model are merged until the
refinement stage converges. Typically this iterative refinement process converges
in 2-4 iterations. With this in mind, we show the accuracy and speed of motion
registration using RPCA and RANSAC in Figure 5. In Figure 5d, the motion
registration accuracy with respect to the two matrices $(R, T)$ is measured by the
sum of the differences to the ground truth $(R_0, T_0)$ in Frobenius norm.
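Read literally, the accuracy measure described above is the sum of two Frobenius-norm differences. A small Python/NumPy sketch of that reading follows; the function name `motion_error` is hypothetical, and treating the translation as a vector (whose Frobenius norm is the Euclidean norm) is an assumption of this example.

```python
import numpy as np

def motion_error(R, T, R0, T0):
    """Sum of Frobenius-norm errors of rotation and translation estimates."""
    rotation_error = np.linalg.norm(np.asarray(R) - np.asarray(R0), "fro")
    translation_error = np.linalg.norm(np.asarray(T) - np.asarray(T0))
    return rotation_error + translation_error
```

For instance, comparing a 90-degree rotation about the z-axis with the identity gives a rotation error of 2, so the measure mixes a unitless rotation term with a translation term in the units of the scene.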

Fig. 5: A simulated comparison between RPCA and RANSAC. PCP is based on the
ALM method. PCP-R means the RPCA approach with a RANSAC-style
iterative refinement stage. Recoverable panels: (a) average runtime vs.
corruption percentage (axis: corruption ratio); (b) RPCA runtime (axis: number
of features).

We can see in Figure 5a that, at a given level of accuracy confidence, the average
runtime of RANSAC grows superlinearly with the increase of the corruption
percentage, while RPCA remains effective in compensating for those corruptions in
the low-rank matrix. Figures 5b and 5c show a reasonable increase in computation
time for RPCA with respect to the number of features and the number of frames
in the motion window $X$. Finally, the accuracy of the estimated rigid-body
transformation is shown in Figure 5d. Without the additional refinement
stage, RPCA already achieves results comparable to RANSAC. If the iterative
refinement is added to the algorithm, we can see significant improvement in the
estimation of the motion. Notice that the estimation errors of $R$ and $T$ are
already very small in all three cases, as shown on the $y$-axis.
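The superlinear growth of the RANSAC runtime is consistent with the standard trial-count bound for hypothesize-and-verify schemes, which is a textbook result rather than something derived in this paper. The Python sketch below evaluates that bound for a minimal sample of four features, matching the setting above; the function name `ransac_trials` and the 99% confidence level are assumptions of this example.

```python
import math

def ransac_trials(outlier_ratio, sample_size=4, confidence=0.99):
    """Trials needed so that, with the given confidence, at least one
    minimal sample contains no outliers (standard RANSAC bound)."""
    if not 0.0 <= outlier_ratio < 1.0:
        raise ValueError("outlier_ratio must lie in [0, 1)")
    p_clean = (1.0 - outlier_ratio) ** sample_size   # all samples are inliers
    if p_clean >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_clean))

trials_10 = ransac_trials(0.1)   # 10% corrupted features
trials_50 = ransac_trials(0.5)   # 50% corrupted features
```

Under these assumptions, raising the corruption ratio from 10% to 50% increases the required number of trials from 5 to 72, which illustrates why a frame-by-frame RANSAC loop becomes costly as corruption grows.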



4.2 Performance on Kinect Data 

We now test the performance of the online SOLO algorithm combined with the 
KLT tracker on a set of real-world depth data collected by a Microsoft Kinect 
sensor. The data are collected in an indoor lab environment. The motion 
registration and scene reconstruction results are shown in Figure 6.

In our experiment, we found the KLT tracking scheme applied on Kinect
to be highly effective in practice, producing upwards of two hundred tracks in
a typical indoor setting. This ensures that the initialization of $X$ has enough
features to converge to the correct low-rank model $L^*$. As expected, the KLT
feature tracker also produces small amounts of local jumps due to repetitive
object textures (e.g., the checkerboard pattern).

Fig. 6: A comparison of SOLO- and RANSAC-based registration results for a
real-world 3-D reconstruction problem. Panels: (a) RANSAC reconstruction;
(b) SOLO reconstruction; (c) RANSAC checkerboard detail; (d) SOLO
checkerboard detail; (e) RANSAC feature registration; (f) SOLO feature
registration; (g) SOLO feature registration with refinement. (e)-(g) are
feature trajectories in the world coordinate system. Features discarded by
our algorithm are shown in red.

For this experiment, we have tuned RANSAC specifically for the empirical 
sample corruption ratio in the scene. Despite this effort, SOLO is still faster by a 
factor of two compared to RANSAC. We emphasize that oracle tuning provides 
a lower bound on the complexity of RANSAC, and its complexity would be 
much higher in a less-controlled, online setting. 

The enlarged checkerboard references demonstrate crisper results for the
SOLO registration than the RANSAC registration. More interestingly, Figs.
6e-g demonstrate feature registrations for RANSAC and SOLO. The red
trajectories are those which are selected by SOLO for rejection. Despite spurious
recorded behavior, such as coarse spatial discontinuities, many of the tracks
are salvageable and properly localized in the cleaned data $L^*$. Overall, SOLO
demonstrates equally good or better registration quality than RANSAC, if
measured qualitatively.

5 Conclusion and Discussion 

We have proposed an online 3-D motion registration algorithm called SOLO. 
Its main advantage compared to existing robust statistical methods such as 
RANSAC is that the algorithm is capable of exploiting the underlying low-rank 
matrix structure in describing the motion and shape of a dominant rigid body.
The initialization step employs Robust PCA to recover such low-rank matrices
and compensate for gross feature corruption and outliers. The online update
step sequentially projects new observations onto the inlier shape subspace by a
sparse subspace projection technique, which is efficient to implement as a
convex program. In our experiments, we have demonstrated equally good
or better motion registration accuracy compared to RANSAC, with a
speed-up of one to two orders of magnitude.

For future work, the results shown in this paper suggest that SOLO can be
applied to a broader range of applications in SLAM. In this paper, we have
considered the motion registration problem for a single motion. In a more 
complex dynamic scene, multiple motions may be captured by the 3-D camera. 
In addition, the multiple motions may be either independent or constrained 
(e.g., a humanoid robot consists of multiple linked rigid limbs and the torso). 
These are some of the interesting problems we intend to investigate further. We 
believe the SOLO framework has laid a solid foundation for us to tackle these 
problems. 